-
Notifications
You must be signed in to change notification settings - Fork 375
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ci: add a github bot to support advanced PR review workflows #3037
base: master
Are you sure you want to change the base?
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #3037 +/- ##
==========================================
+ Coverage 63.77% 63.79% +0.01%
==========================================
Files 548 573 +25
Lines 78681 79587 +906
==========================================
+ Hits 50180 50773 +593
- Misses 25117 25407 +290
- Partials 3384 3407 +23
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
@aeddi would you like a preliminary review? |
8da569e
to
4a3e93d
Compare
could you please add a brief description to give people an early preview, even if it's draft? |
@thehowl Sorry for the delayed response. I hadn't seen your message. I still have some cleanup to do and need to figure out how to write some tests, but I think the PR is pre-reviewable now. @ltzmaxwell Done ;) |
7b63f89
to
1a34f18
Compare
1a34f18
to
7aa06b8
Compare
I haven't looked into this in depth yet, but I'm considering whether we could introduce an LLM, like ChatGPT, to assist with code review. It could help summarize PR modifications and potentially identify issues in the code. This is outside the current scope, but I wanted to leave some thoughts here for discussion. |
Yes, why not have this kind of tool just to detect details we might have missed? Just quickly checking, I see there are Actions on the Github marketplace dedicated to this. |
3f419b8
to
f85b439
Compare
f85b439
to
78268ac
Compare
When do you think we should start using it and providing feedback for iteration? |
@moul As soon as it's reviewed and merged, I think we can start by setting up a few simple rules to make sure everything is working well, then iterate from there. BTW, the failing check is because the bot workflow added to this PR is trying to run while the repo isn't configured yet (specifically, the workflow requires setting a GitHub Actions secret and variable). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this way of implementing a bot is not secure. Giving random access to code execution to anyone creating a PR is not a good idea IMO.
- assigned | ||
- unassigned | ||
- labeled | ||
- unlabeled | ||
- opened | ||
- reopened | ||
- synchronize # PR head updated | ||
- converted_to_draft | ||
- ready_for_review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All existing types for reference:
- assigned
- unassigned
- labeled
- unlabeled
- opened
- edited
- closed
- reopened
- synchronize
- converted_to_draft
- locked
- unlocked
- enqueued
- dequeued
- milestoned
- demilestoned
- ready_for_review
- review_requested
- review_request_removed
- auto_merge_enabled
- auto_merge_disabled
Why don't just run on all types?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I only included the ones that were used by the existing conditions/requirements of the bot. It seems better to me than subscribing to all possible events for several reasons :
- It makes it clearer which events the bot uses
- in term of security, it follows the PoLP (even though a trigger is not exactly a privilege)
- it prevents the workflow being triggered and launching two runners for nothing
With the idea that if we added, for example, a condition/requirement based on adding to/removing from a milestone, we would just add the corresponding trigger event to the workflow.
|
||
# Watch for changes on PR comment | ||
issue_comment: | ||
types: [created, edited, deleted] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The same, why don't just run on all types and let the bot decide what to do?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above.
go-version-file: go.mod | ||
|
||
- name: Run GitHub Bot | ||
working-directory: contribs/github-bot |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
More than a bot is a script, right? I was expecting a server handling events through the GitHub API with a state.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When we defined the specifications with @moul and @thehowl (I can't remember who else was on the call we had in Turin about it), one of the requirements was that the bot shouldn't run on a server but that everything should be processed through GitHub Actions.
It makes things more complicated to test and the manual checks more complicated to implement using the comment body as their state instead of a DB or similar, but it has the advantage of having a piece of code that runs in a serverless way.
But if the name is misleading, we can rename it.
# This job creates a matrix of PR numbers based on the inputs from the various | ||
# events that can trigger this workflow so that the process-pr job below can | ||
# handle the parallel processing of the pull-requests | ||
define-prs-matrix: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't fully understand this job. Do you mind to explain it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes sorry if it's not clear enough, so there are two constraints:
- First, we need to prevent multiple runners from processing the same PR at the same time (to avoid any bug)
- Second, there are different events that can trigger this workflow, and not all of them provide the PR number(s) to process the same way.
So the goal of this job (define-prs-matrix) is first to establish a list of PR numbers to process from the different possible events, then to format this list as a matrix so that the next job (process-pr) can launch a runner for each PR number in the matrix while ensuring (through the 'concurrency' feature of GitHub Actions) that no PR is processed concurrently.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@aeddi some additional questions here:
- Is there a scenario where one PR can block another PR? Or concurrency only happen when manually triggered?
- How much time does a job usually take to complete? If one PR can block another PR and take 10 minutes (e.g., long process, network issue, ...) to complete, it might start to be a problem.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- The concurrency parameter of the job is really set to prevent a PR X from being processed by two runners in parallel. For example, if PR 1111 is already being processed by one runner and a new event (any event, manual or not) triggers the job to also process PR 1111, the new runner will be queued and will wait for the first runner to finish execution before starting the job. However, the processing of PR 2222 can occur without any issues in parallel with that of 1111.
- Never saw a run last longer than 1 min, on average the execution time is around 40 secs (and one PR can't block another one anyway ;) )
I'm not sur to understand what differ from any other action. I mean to run code when opening a PR or any other GitHub event, we already have more than 130 checks that run for each PR we open. Also in the case of this bot, we own its code in our repo and have total control over it. Which looks more secure to me than using third-party actions. Still following the PoLP, the only permissions the bot requires right now are those listed in the README, which are not very sensitive IMO. The write permissions are for the scopes Pull Requests (managing comments, labels, milestones, etc.) and Commit Statuses (showing a check at the bottom of the PR). It can't change any code on the repo, for instance. Do you have another approach to suggest or specific guideline to follow? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
overall LGTM, my only requirement is to get ride of this // nolint:govet
and fix cancel leaking.
I'm not sure about @ajnavarro's concern; as long as the token is correctly scoped, it should be fine IMO.
To me, the main issue is that if the PR comes from a fork, the workflow could not be able to grab the bot secret and therefore cannot access the token. Can you (have you?) check the scenario where you trigger this workflow from a branch of an external fork?
With the exception of GITHUB_TOKEN, secrets are not passed to the runner when a workflow is triggered from a forked repository.
contribs/github-bot/main.go
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: for consistency's sake among the project, I would prefer that you use ffcli
here. You can check other main files in the project; most of them are using the internal command
package, which is actually a fork of ffcli
.
# This job creates a matrix of PR numbers based on the inputs from the various | ||
# events that can trigger this workflow so that the process-pr job below can | ||
# handle the parallel processing of the pull-requests | ||
define-prs-matrix: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@aeddi some additional questions here:
- Is there a scenario where one PR can block another PR? Or concurrency only happen when manually triggered?
- How much time does a job usually take to complete? If one PR can block another PR and take 10 minutes (e.g., long process, network issue, ...) to complete, it might start to be a problem.
About the PR opened from a fork : I remember we discussed it together, but I totally forgot to take it into account. The workaround is to use the In this context, GitHub allows the workflow to access to GitHub Actions secrets and also allows I just tested it on the demo repo, and it works. I will make the correction, thanks for the reminder. Edit : done fb598ee |
This pull request aims to add a bot that extends GitHub's functionalities like codeowners file and other merge protection mechanisms. Interaction with the bot is done via a comment. You can test it on the demo repo here : GnoCheckBot/demo#1
Fixes #1007
Related to #1466, #2788
config.go
file contains all the conditions and requirements in an 'If - Then' format.condition
folder.requirement
folder).To do
Contributors' checklist...
BREAKING CHANGE: xxx
message was included in the description